6 research outputs found

    Knowledge Extraction from Natural Language Requirements into a Semantic Relation Graph

    Knowledge extraction and representation aims to identify information and to transform it into a machine-readable format. Knowledge representations support Information Retrieval tasks such as searching for single statements, documents, or metadata. Requirements specifications of complex systems such as automotive software systems are usually divided into separate subsystem specifications. Nevertheless, there are semantic relations (e.g. dependencies) between individual documents of the separated subsystems, which have to be considered in further processes. If requirements engineers or other developers are not aware of these relations, this can lead to inconsistencies or malfunctions of the overall system. Therefore, there is a strong need for tool support to detect semantic relations in a set of large natural language requirements specifications. In this work we present a knowledge extraction approach based on an explicit knowledge representation of the content of natural language requirements as a semantic relation graph. Our approach is fully automated and includes an NLP pipeline to transform unrestricted natural language requirements into a graph. We split the natural language into different parts and relate them to each other based on their semantic relation. In addition to semantic relations, other relationships can also be included in the graph. We envision using a semantic search algorithm like spreading activation to allow users to search different semantic relations in the graph.
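
    The spreading activation search mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the edge weights, and the `decay`, `threshold`, and `max_steps` parameters are all assumptions chosen for illustration. Activation starts at a seed concept and flows along weighted relations, fading with each hop.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Propagate activation from seed nodes through a weighted graph.

    graph: dict mapping node -> list of (neighbor, relation_weight)
    seeds: dict mapping node -> initial activation
    All parameter names and defaults are illustrative assumptions.
    """
    activation = defaultdict(float)
    activation.update(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                passed = energy * weight * decay
                # Drop contributions that are too weak to matter.
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical semantic relation graph built from requirement fragments.
toy_graph = {
    "brake pedal": [("brake light", 1.0), ("brake pressure", 0.8)],
    "brake light": [("rear lamp unit", 0.6)],
    "brake pressure": [],
}

result = spread_activation(toy_graph, {"brake pedal": 1.0})
ranked = sorted(result.items(), key=lambda item: item[1], reverse=True)
for concept, score in ranked:
    print(f"{concept}: {score:.2f}")
```

    Concepts that end up with high activation are candidates for being semantically related to the query concept, and the path the activation took can serve as an explanation.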

    Automatische Duplikateliminierung in Aktivitätsdiagrammen von Fahrzeugfunktionen (Automatic Duplicate Elimination in Activity Diagrams of Vehicle Functions)

    The article may also be found at https://dl.gi.de/handle/20.500.12116/1039. Specifying vehicle functions is a complex task. To manage this complexity, graphical modeling languages such as UML are used to describe the functions. Modeling can introduce duplicates, which become a source of errors and inconsistencies in later development. This paper addresses the elimination of duplicates that arise when vehicle functions are specified using UML activity diagrams. It shows how duplicates identified in UML activity diagrams are eliminated automatically without changing the original functionality. Elements that occur multiple times are merged and reconnected by inserting additional elements and connections.

    Knowledge Representation of Requirements Documents Using Natural Language Processing

    Complex systems such as automotive software systems are usually broken down into subsystems that are specified and developed in isolation and afterwards integrated to provide the functionality of the desired system. This results in a large number of requirements documents for each subsystem, written by different people in different departments. Requirements engineers struggle to comprehend the concepts mentioned in a requirement because related information is spread over several requirements documents. In this paper, we describe a natural language processing pipeline that we developed to transform a set of heterogeneous natural language requirements into a knowledge representation graph. The graph provides an orthogonal view of the concepts and relations written in the requirements. We provide a first validation of the approach by applying it to two requirements documents including more than 7,000 requirements from industrial systems. We conclude the paper by stating open challenges and potential applications of the knowledge representation graph.

    Trace Link Recovery using Semantic Relation Graphs and Spreading Activation

    Trace Link Recovery tries to identify and link related existing requirements with each other to support further engineering tasks. Existing approaches are mainly based on algebraic Information Retrieval or machine learning. Machine learning approaches usually demand reasonably large, labeled datasets for training. Algebraic Information Retrieval approaches, such as the distance between tf-idf scores, also work on smaller datasets without training but are limited in providing explanations for trace links. In this work, we present a Trace Link Recovery approach that is based on an explicit representation of the content of requirements as a semantic relation graph and uses Spreading Activation to answer trace queries over this graph. Our approach is fully automated, including an NLP pipeline to transform unrestricted natural language requirements into a graph. We evaluate our approach on five common datasets. The predictive power varies strongly with the selected configuration. With the best tested configuration, the approach achieves a mean average precision of 40% and a Lag of 50%. Even though the predictive power of our approach does not outperform state-of-the-art approaches, we think that an explicit knowledge representation is an interesting artifact to explore in Trace Link Recovery approaches to generate explanations and refine results.
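
    For readers unfamiliar with the mean average precision figure reported in this abstract, the following sketch shows how the metric is commonly computed for ranked trace link candidates. The requirement identifiers and rankings are invented for illustration and are not taken from the paper's datasets. The Lag metric is not covered here.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked candidate list.

    ranked: list of candidate ids, best first
    relevant: set of ids that are true trace links for the query
    """
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at the rank where a true link is found.
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of the average precision over all (ranked, relevant) queries."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical trace queries: each pairs a ranking with its true links.
example_queries = [
    (["R2", "R7", "R5", "R9"], {"R2", "R5"}),
    (["R4", "R1"], {"R1"}),
]

map_score = mean_average_precision(example_queries)
print(f"MAP = {map_score:.3f}")
```

    A score of 1.0 would mean every true link is ranked above every false candidate for every query, so lower values indicate true links buried further down the rankings.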

    Removal of redundant elements within UML activity diagrams

    As the complexity of systems continues to rise, model-driven development approaches are becoming more widely applied. Still, many created models are mainly used for documentation. As such, they are not designed to be used in later stages of development, but merely as a means of improved overview and communication. In an effort to use existing UML2 activity diagrams of an industry partner (Daimler AG) as a source for automatic generation of software artifacts, we discovered that the diagrams often contain multiple instances of the same element. These redundant instances might improve the readability of a diagram. However, they complicate further approaches such as automated model analysis or traceability to other artifacts, because in most cases redundant instances must be handled as a single distinct element. In this paper, we present an approach to automatically remove redundant ExecutableNodes within activity diagrams as they are used by our industry partner. The removal is implemented by merging the redundant instances into a single element and adding further elements to maintain the original behavior of the activity. We use reachability graphs to argue that our approach preserves the behavior of the activity. Additionally, we applied the approach to a real system described by 36 activity diagrams. As a result, 25 redundant instances were removed from 15 affected diagrams.
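
    The merging step described in this abstract can be pictured with a deliberately simplified sketch. It assumes that an activity is given as a dictionary of node ids to action names plus a list of edges, and that two nodes are redundant when they carry the same action name. Both are assumptions made for illustration. The sketch covers only the merge itself. When merged instances have different successors, a plain merge would mix control flow paths, which is why the paper adds further elements to preserve behavior. That part is not reproduced here.

```python
def merge_duplicates(nodes, edges):
    """Merge nodes that share an action name into their first occurrence.

    nodes: dict mapping node id -> action name
    edges: list of (source id, target id) control flow edges
    Returns the surviving nodes and the redirected, de-duplicated edges.
    """
    representative = {}  # action name -> id of the first node seen
    remap = {}           # every node id -> id of the surviving node
    for node_id, name in nodes.items():
        remap[node_id] = representative.setdefault(name, node_id)
    merged_nodes = {nid: name for nid, name in nodes.items() if remap[nid] == nid}
    merged_edges = sorted({(remap[src], remap[dst]) for src, dst in edges})
    return merged_nodes, merged_edges


# Hypothetical activity with "Log value" modeled twice (n3 and n5).
activity_nodes = {
    "n1": "Read sensor",
    "n2": "Check threshold",
    "n3": "Log value",
    "n4": "Trigger warning",
    "n5": "Log value",
}
activity_edges = [("n1", "n2"), ("n2", "n3"), ("n2", "n4"), ("n4", "n5")]

merged_nodes, merged_edges = merge_duplicates(activity_nodes, activity_edges)
print(merged_nodes)
print(merged_edges)
```

    In this toy activity both "Log value" instances are end points without outgoing edges, so redirecting the edge from n5 to n3 does not change which actions can follow one another.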

    Supporting the Development of Cyber-Physical Systems with Natural Language Processing: A Report

    Software has become the driving force for innovation in any technical system that observes the environment with different sensors and influences it by controlling a number of actuators, nowadays called a Cyber-Physical System (CPS). The development of such systems is inherently interdisciplinary and often involves a number of independent subsystems. Due to this diversity, the majority of development information is expressed in natural language artifacts of all kinds. In this paper, we report on recent results that our group has developed to support engineers of CPSs in working with the large amount of information expressed in natural language. We cover the topics of automatic knowledge extraction, expert systems, and automatic requirements classification. Furthermore, we envision that natural language processing will be a key component to connect requirements with simulation models and to explain tool-based decisions. We see both areas as promising for supporting engineers of CPSs in the future.